Robust Statistical Ranking: Theory and Algorithms

Authors

  • Qianqian Xu
  • Jiechao Xiong
  • Qingming Huang
  • Yuan Yao
Abstract

Deeply rooted in classical social choice and voting theory, statistical ranking with paired comparison data has experienced a renaissance with the spread of crowdsourcing techniques. Because data quality can be significantly degraded in an uncontrolled crowdsourcing environment, outlier detection and robust ranking have become active topics in the analysis of such data. In this paper, we propose a robust ranking framework based on the principle of Huber’s robust statistics, which formulates outlier detection as a LASSO problem to find sparse approximations of the cyclic ranking projection in Hodge decomposition. Moreover, simple yet scalable algorithms are developed based on Linearized Bregman Iteration to achieve an even less biased estimator than LASSO. Statistical consistency of outlier detection is established in both cases: when the outliers are strong enough and the comparisons are sampled from an Erdős-Rényi random graph, outliers can be faithfully detected. Our studies are supported by experiments with both simulated examples and real-world data. The proposed framework provides a promising tool for robust ranking with large-scale crowdsourcing data arising from computer vision, multimedia, machine learning, sociology, etc.
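
To make the LASSO formulation of outlier detection concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common model y_ij = theta_i - theta_j + gamma_ij, where gamma is a sparse outlier vector, and minimizes 0.5 * ||y - X theta - gamma||^2 + lam * ||gamma||_1 by simple alternating minimization (a least-squares step for theta, a soft-thresholding step for gamma). It does not use the Linearized Bregman Iteration developed in the paper. The function names, the value of `lam`, and the toy data are all illustrative assumptions.

```python
import itertools
import numpy as np


def soft_threshold(r, lam):
    """Shrink each entry of r toward zero by lam (proximal map of the L1 norm)."""
    return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)


def huber_lasso_rank(edges, y, n_items, lam, n_iter=200):
    """Illustrative solver: alternate least squares (theta) and soft-thresholding (gamma)."""
    # Each row of X encodes one comparison (i, j): +1 for item i, -1 for item j.
    X = np.zeros((len(edges), n_items))
    for k, (i, j) in enumerate(edges):
        X[k, i], X[k, j] = 1.0, -1.0

    gamma = np.zeros(len(edges))
    for _ in range(n_iter):
        # Scores that best explain the comparisons once suspected outliers are removed.
        theta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # Residuals larger than lam are attributed to outliers.
        gamma = soft_threshold(y - X @ theta, lam)

    theta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
    return theta, gamma


# Toy data (assumed for illustration): 6 items, every pair compared once, no noise.
n_items = 6
true_scores = np.array([5.0, 4.0, 3.0, 2.0, 1.0, 0.0])
edges = list(itertools.combinations(range(n_items), 2))
y = np.array([true_scores[i] - true_scores[j] for i, j in edges])

# Corrupt a single comparison: item 0 vs item 5 is reported with the wrong sign.
outlier_index = edges.index((0, 5))
y[outlier_index] = -5.0

theta, gamma = huber_lasso_rank(edges, y, n_items, lam=2.0)
detected = np.flatnonzero(np.abs(gamma) > 1e-8).tolist()  # indices of flagged comparisons
ranking = np.argsort(-theta).tolist()                      # items ordered best to worst
print("flagged comparisons:", [edges[k] for k in detected])
print("ranking:", ranking)
```

In this sketch the nonzero entries of `gamma` mark the comparisons flagged as outliers, and `theta` gives the scores used for ranking. Because the L1 penalty shrinks the estimated outlier magnitudes, the fitted scores remain somewhat biased, which is the kind of bias the abstract says the Linearized Bregman approach is meant to reduce.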

Similar resources

Robust uncapacitated multiple allocation hub location problem under demand uncertainty: minimization of cost deviations

The hub location–allocation problem under uncertainty is a real-world task arising in areas such as public and freight transportation and telecommunication systems. In many applications, the demand is considered inexact because of forecasting inaccuracies or human unpredictability. This study addresses the robust uncapacitated multiple allocation hub location problem with a set of ...

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing search engine results; ranking takes place according to one or more parameters such as backward links, forward links, content, click-through rate, etc. The quality and performance of these algorithms depend on the listed parameters. Ranking is one of the most important components of the search engine that represents the degree of the vitality...

Ranking Function Discovery by Genetic Programming for Robust Retrieval

Ranking functions are instrumental for the success of an information retrieval (search engine) system. However, nearly all existing ranking functions are manually designed based on experience, observations and probabilistic theories. This paper tested a novel ranking function discovery technique proposed in [Fan 2003a, Fan 2003b] – ARRANGER (Automatic geneRation of RANking functions by GEnetic pR...

A robust aggregation operator for multi-criteria decision-making method with bipolar fuzzy soft environment

Molodtsov initiated soft set theory, which provides a general mathematical framework for handling uncertainties by attaching a parameterization factor to the data during information analysis, in contrast to fuzzy and bipolar fuzzy set theory. The main objective of this paper is to lay a foundation for providing a new application of bipolar fuzzy soft tool in ...

Iterative Methods for Ranking Students with Noisy Questions

We study the problem of ranking students by their abilities, solely based on responses to student-sourced multiple-choice questions. This addresses the crucial problem of scaling automatic assessment of students to very large class sizes. Current state-of-the-art methods (i) assume student responses obey a parameterized model, (ii) were designed for situations with trusted questions, and (iii) a...

Journal title:
  • CoRR

Volume abs/1408.3467

Publication date: 2014